Fluency Over Adequacy: A Pilot Study in Measuring User Trust in Imperfect MT
Authors
Abstract
Although measuring intrinsic quality has been a key factor in the advancement of Machine Translation (MT), successfully deploying MT requires considering not just intrinsic quality but also the user experience, including aspects such as trust. This work introduces a method of studying how users modulate their trust in an MT system after seeing errorful (disfluent or inadequate) output amidst good (fluent and adequate) output. We conduct a survey to determine how users respond to good translations compared to translations that are either adequate but not fluent, or fluent but not adequate. In this pilot study, users responded strongly to disfluent translations, but were, surprisingly, much less concerned with adequacy.
Similar resources
Stochastic Iterative Alignment for Machine Translation Evaluation
A number of metrics for automatic evaluation of machine translation have been proposed in recent years, with some metrics focusing on measuring the adequacy of MT output, and other metrics focusing on fluency. Adequacy-oriented metrics such as BLEU measure n-gram overlap of MT outputs and their references, but do not represent sentence-level information. In contrast, fluency-oriented metrics su...
The Expected Achievable Distortion of Two-User Decentralized Interference Channels
This paper concerns the transmission of two independent Gaussian sources over a two-user decentralized interference channel, assuming that the transmitters are unaware of the instantaneous CSIs. The availability of the channel state information at receivers (CSIR) is considered in two scenarios of perfect and imperfect CSIR. In the imperfect CSIR case, we consider a more practical assumption of...
MT-EQuAl: a Toolkit for Human Assessment of Machine Translation Output
MT-EQuAl (Machine Translation Errors, Quality, Alignment) is a toolkit for human assessment of Machine Translation (MT) output. MT-EQuAl implements three different tasks in an integrated environment: annotation of translation errors, translation quality rating (e.g. adequacy and fluency, relative ranking of alternative translations), and word alignment. The toolkit is web-based and multi-user, a...
Calibrating resource-light automatic MT evaluation
MT systems are traditionally evaluated with different criteria, such as adequacy and fluency. Automatic evaluation scores are designed to match these quality parameters. In this paper we introduce a novel parameter – usability (or utility) of output, which was found to integrate both fluency and adequacy. We confronted two automated metrics, BLEU and LTV, with new data for which human evaluatio...
Through the Eyes of VERTa
This paper describes a practical demo of VERTa for Spanish. VERTa is an MT evaluation metric that combines linguistic features at different levels. VERTa has been developed for English and Spanish but can be easily adapted to other languages. VERTa can be used to evaluate adequacy, fluency and ranking of sentences. In this paper, VERTa’s modules are described briefly, as well as its graphical i...
Journal title:
- CoRR
Volume abs/1802.06041
Publication date 2018